Results 1 - 4 of 4
1.
LREC 2022: Thirteenth International Conference on Language Resources and Evaluation ; : 480-490, 2022.
Article in English | Web of Science | ID: covidwho-2308155

ABSTRACT

To cope with the COVID-19 pandemic, many jurisdictions have introduced new or altered existing legislation. Even though these new rules are often communicated to the public in news articles, it remains challenging for laypersons to learn what is currently allowed or forbidden, since news articles typically do not reference the underlying laws. We investigate an automated approach to extract legal claims from news articles and to match the claims with their corresponding applicable laws. We examine the feasibility of the two tasks on claims about COVID-19-related laws from Berlin, Germany. For both tasks, we create and make publicly available the data sets and report the results of initial experiments. We obtain promising results with Transformer-based models that achieve 46.7 F1 for claim extraction and 91.4 F1 for law matching, albeit with some conceptual limitations. Furthermore, we discuss challenges of current machine learning approaches for legal language processing and their suitability for complex legal reasoning tasks.
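[Editor's note: as a rough illustration of the evaluation metric reported above (not the paper's actual scorer), F1 is the harmonic mean of precision and recall over predicted versus gold items, e.g. extracted claim spans. A minimal sketch in Python:]

```python
# Illustrative sketch only: set-based precision/recall/F1, as commonly
# used to score extraction tasks such as claim extraction.
def f1_score(predicted: set, gold: set) -> float:
    """Harmonic mean of precision and recall over two sets of items
    (e.g. token indices or spans of extracted claims)."""
    if not predicted or not gold:
        return 0.0
    tp = len(predicted & gold)           # items found in both sets
    precision = tp / len(predicted)      # fraction of predictions that are correct
    recall = tp / len(gold)              # fraction of gold items recovered
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

For example, predicting spans {1, 2, 3} against gold {2, 3, 4} yields precision and recall of 2/3 each, hence F1 of 2/3.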

2.
24th International Conference on Human-Computer Interaction, HCII 2022 ; 13517 LNCS:142-165, 2022.
Article in English | Scopus | ID: covidwho-2173838

ABSTRACT

The increasingly rapid spread of information about COVID-19 on the web calls for automatic measures of credibility assessment [18]. If large parts of the population are expected to act responsibly during a pandemic, they need information that can be trusted [20]. In that context, we model the credibility of texts using 25 linguistic phenomena, such as spelling, sentiment and lexical diversity. We integrate these measures in a graphical interface and present two empirical studies to evaluate its usability for credibility assessment of COVID-19 news. Raw data for the studies, including all questions and responses, has been made available to the public under an open license: https://github.com/konstantinschulz/credible-covid-ux. The user interface prominently features three sub-scores and an aggregation for a quick overview. In addition, metadata about the concept, authorship and infrastructure of the underlying algorithm is provided explicitly. Our working definition of credibility is operationalized through the terms of trustworthiness, understandability, transparency, and relevance. Each of them builds on well-established scientific notions [41, 65, 68] and is explained orally or through Likert scales. In a moderated qualitative interview with six participants, we introduce information transparency for news about COVID-19 as the general goal of a prototypical platform, accessible through an interface in the form of a wireframe [43]. The participants' answers are transcribed in excerpts. Then, we triangulate inductive and deductive coding methods [19] to analyze their content. As a result, we identify rating scale, sub-criteria and algorithm authorship as important predictors of usability. In a subsequent quantitative online survey, we present a questionnaire with wireframes to 50 crowdworkers. The question formats include Likert scales, multiple choice and open-ended types. This way, we aim to strike a balance between the known strengths and weaknesses of open vs. closed questions [11]. The answers reveal a conflict between transparency and conciseness in the interface design: users tend to ask for more information but do not necessarily make explicit use of it when given. This discrepancy is influenced by capacity constraints of human working memory [38]. Moreover, a perceived hierarchy of metadata becomes apparent: the authorship of a news text is more important than the authorship of the algorithm used to assess its credibility. From the first to the second study, we notice an improved usability of the aggregated credibility score's scale. That change is due to the conceptual introduction before seeing the actual interface, as well as the simplified binary indicators with direct visual support. Sub-scores need to be handled similarly if they are supposed to contribute meaningfully to the overall credibility assessment. By integrating detailed information about the employed algorithm, we are able to dispel the users' doubts about its anonymity and possible hidden agendas. However, the overall transparency can only be increased if other, more important factors, such as the source of the news article, are provided as well. Knowledge about this interaction enables software designers to build useful prototypes with a strong focus on the most important elements of credibility: the source of the text and the algorithm, as well as the distribution and composition of the algorithm. All in all, the understandability of our interface was rated as acceptable (78% of responses being neutral or positive), while transparency (70%) and relevance (72%) still lag behind. This discrepancy is closely related to the missing article metadata and the need for more meaningful, visually supported explanations of credibility sub-scores. The insights from our studies lead to a better understanding of the amount, sequence and relation of information that needs to be provided in interfaces for credibility assessment.
In particular, our integration of software metadata contributes to the more holistic notion of credibility [47, 72] that has become popular in recent years. In addition, it paves the way for a more thoroughly informed interaction between humans and machine-generated assessments, anticipating the users' doubts and concerns [39] in early stages of the software design process [37]. Finally, we make suggestions for future research, such as proactively documenting credibility-related metadata for Natural Language Processing and Language Technology services and establishing an explicit hierarchical taxonomy of usability predictors for automatic credibility assessment. © 2022, Springer Nature Switzerland AG.
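[Editor's note: the aggregation of sub-scores into one overall credibility score described above could, for instance, take the form of a weighted average. The function, score names and weights below are purely hypothetical and not the study's actual formula:]

```python
# Hypothetical sketch: combine credibility sub-scores (each in [0, 1])
# into a single overall score via a weighted average. The weights are
# illustrative assumptions, not values from the study.
def aggregate_credibility(sub_scores: dict, weights: dict) -> float:
    """Weighted average of the sub-scores present in `sub_scores`."""
    total_weight = sum(weights[name] for name in sub_scores)
    weighted_sum = sum(sub_scores[name] * weights[name] for name in sub_scores)
    return weighted_sum / total_weight


# Hypothetical usage with equal weights for two of the criteria:
overall = aggregate_credibility(
    {"trustworthiness": 0.8, "transparency": 0.6},
    {"trustworthiness": 1.0, "transparency": 1.0},
)
```

With equal weights this reduces to a plain mean, so the example above yields 0.7.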

3.
13th International Conference on Language Resources and Evaluation, LREC 2022 ; : 480-490, 2022.
Article in English | Scopus | ID: covidwho-2168054

ABSTRACT

To cope with the COVID-19 pandemic, many jurisdictions have introduced new or altered existing legislation. Even though these new rules are often communicated to the public in news articles, it remains challenging for laypersons to learn what is currently allowed or forbidden, since news articles typically do not reference the underlying laws. We investigate an automated approach to extract legal claims from news articles and to match the claims with their corresponding applicable laws. We examine the feasibility of the two tasks on claims about COVID-19-related laws from Berlin, Germany. For both tasks, we create and make publicly available the data sets and report the results of initial experiments. We obtain promising results with Transformer-based models that achieve 46.7 F1 for claim extraction and 91.4 F1 for law matching, albeit with some conceptual limitations. Furthermore, we discuss challenges of current machine learning approaches for legal language processing and their suitability for complex legal reasoning tasks. © European Language Resources Association (ELRA), licensed under CC-BY-NC-4.0.

4.
2nd Workshop Reducing Online Misinformation through Credible Information Retrieval, ROMCIR 2022 ; 3138:27-47, 2022.
Article in English | Scopus | ID: covidwho-1871513

ABSTRACT

The processing, identification and fact-checking of online information has received a lot of attention recently. One of the challenges is that scandalous or "blown-up" news tends to become viral, even when coming from unreliable sources. Particularly during a global pandemic, it is crucial to find efficient ways of determining the credibility of information. Fact-checking initiatives such as Snopes and FactCheck.org perform manual claim validation, but they are unable to cover all suspicious claims that can be found online; they focus mainly on the ones that have gone viral. Similarly, it is impossible for the general user to fact-check every single statement on a specific topic. While a lot of research has been carried out on both claim verification and fact-check-worthiness, little work has been done so far on the detection and extraction of dubious claims, combined with fact-checking them against external knowledge bases, especially in the COVID-19 domain. Our approach involves a two-step claim verification procedure consisting of a fake news detection task in the form of binary sequence classification and fact-checking using the Google Fact Check Tools. We primarily work on medium-sized documents in the English language. Our prototype is able to recognize, on a higher level, the nature of fake news, even when hidden in a text that seems credible at first glance. This way we can alert the reader that a document contains suspicious statements, even if no already validated similar claims exist. For more popular claims, however, multiple results are found and displayed. We achieve an F1 score of 98.03% and an accuracy of 98.1% in the binary fake news detection task using a fine-tuned DistilBERT model. © 2022 Copyright for this paper by its authors.
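[Editor's note: the accuracy and F1 figures reported for binary fake news detection can be computed from raw label predictions as below. This is a generic metrics sketch, not the authors' DistilBERT pipeline:]

```python
# Illustrative sketch: accuracy and F1 for a binary classifier
# (label 1 = fake, label 0 = credible) from gold and predicted labels.
def binary_metrics(y_true: list, y_pred: list) -> tuple:
    """Return (accuracy, F1) for binary labels, with class 1 as positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, f1
```

Note that with a near-balanced test set, accuracy and F1 track each other closely, which is consistent with the paper reporting 98.1% accuracy alongside 98.03% F1.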
